Overview

Dataset statistics

Number of variables: 12
Number of observations: 2166
Missing cells: 402
Missing cells (%): 1.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 203.2 KiB
Average record size in memory: 96.1 B

Variable types

Numeric: 11
Categorical: 1

Warnings

Year has constant value "0.0" [Constant]
df_index is highly correlated with ASGS_2016 [High correlation]
ASGS_2016 is highly correlated with df_index [High correlation]
Houses - median sale price ($) is highly correlated with RENT_2 and 1 other fields [High correlation]
HHTYPE_6 is highly correlated with TENURE_3 [High correlation]
RENT_2 is highly correlated with Houses - median sale price ($) and 1 other fields [High correlation]
RENT_3 is highly correlated with Houses - median sale price ($) and 1 other fields [High correlation]
TENURE_2 is highly correlated with TENURE_4 [High correlation]
TENURE_3 is highly correlated with HHTYPE_6 and 1 other fields [High correlation]
TENURE_4 is highly correlated with TENURE_2 and 1 other fields [High correlation]
MARRIAGE_2 is highly correlated with MARRIAGE_4 [High correlation]
MARRIAGE_4 is highly correlated with MARRIAGE_2 [High correlation]
df_index is highly correlated with ASGS_2016 [High correlation]
ASGS_2016 is highly correlated with df_index [High correlation]
Houses - median sale price ($) is highly correlated with RENT_2 and 1 other fields [High correlation]
HHTYPE_6 is highly correlated with TENURE_3 [High correlation]
RENT_2 is highly correlated with Houses - median sale price ($) and 1 other fields [High correlation]
RENT_3 is highly correlated with Houses - median sale price ($) and 1 other fields [High correlation]
TENURE_2 is highly correlated with TENURE_4 [High correlation]
TENURE_3 is highly correlated with HHTYPE_6 and 1 other fields [High correlation]
TENURE_4 is highly correlated with TENURE_2 and 1 other fields [High correlation]
MARRIAGE_2 is highly correlated with MARRIAGE_4 [High correlation]
MARRIAGE_4 is highly correlated with MARRIAGE_2 [High correlation]
df_index is highly correlated with ASGS_2016 [High correlation]
ASGS_2016 is highly correlated with df_index [High correlation]
Houses - median sale price ($) is highly correlated with RENT_2 and 1 other fields [High correlation]
HHTYPE_6 is highly correlated with TENURE_3 [High correlation]
RENT_2 is highly correlated with Houses - median sale price ($) and 1 other fields [High correlation]
RENT_3 is highly correlated with Houses - median sale price ($) and 1 other fields [High correlation]
TENURE_3 is highly correlated with HHTYPE_6 [High correlation]
MARRIAGE_2 is highly correlated with MARRIAGE_4 [High correlation]
MARRIAGE_4 is highly correlated with MARRIAGE_2 [High correlation]
TENURE_2 is highly correlated with HHTYPE_6 and 3 other fields [High correlation]
HHTYPE_6 is highly correlated with TENURE_2 and 4 other fields [High correlation]
MARRIAGE_2 is highly correlated with ASGS_2016 and 2 other fields [High correlation]
ASGS_2016 is highly correlated with MARRIAGE_2 and 3 other fields [High correlation]
MARRIAGE_4 is highly correlated with MARRIAGE_2 and 2 other fields [High correlation]
RENT_3 is highly correlated with TENURE_2 and 7 other fields [High correlation]
df_index is highly correlated with MARRIAGE_2 and 4 other fields [High correlation]
RENT_2 is highly correlated with HHTYPE_6 and 5 other fields [High correlation]
TENURE_3 is highly correlated with TENURE_2 and 4 other fields [High correlation]
TENURE_4 is highly correlated with TENURE_2 and 4 other fields [High correlation]
Houses - median sale price ($) is highly correlated with RENT_3 and 1 other fields [High correlation]
Houses - median sale price ($) has 375 (17.3%) missing values [Missing]
df_index has unique values [Unique]
ASGS_2016 has unique values [Unique]

Reproduction

Analysis started: 2021-08-18 15:16:22.309394
Analysis finished: 2021-08-18 15:16:44.238934
Duration: 21.93 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

df_index
Real number (ℝ)

HIGH CORRELATION
UNIQUE

Distinct: 2166
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.574609941 × 10^-16
Minimum: -1.72035955
Maximum: 1.792789285
Zeros: 0
Zeros (%): 0.0%
Negative: 1091
Negative (%): 50.4%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -1.72035955
5-th percentile: -1.551222144
Q1: -0.85933126
Median: -0.01211010551
Q3: 0.8535205625
95-th percentile: 1.574176311
Maximum: 1.792789285
Range: 3.513148834
Interquartile range (IQR): 1.712851822

Descriptive statistics

Standard deviation: 1.00023092
Coefficient of variation (CV): 6.352245685 × 10^15
Kurtosis: -1.175878038
Mean: 1.574609941 × 10^-16
Median Absolute Deviation (MAD): 0.8568094428
Skewness: 0.02412642109
Sum: 3.410605132 × 10^-13
Variance: 1.000461894
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-0.8305663951 | 1 | < 0.1%
-1.01466153 | 1 | < 0.1%
-1.166540017 | 1 | < 0.1%
0.39980276 | 1 | < 0.1%
-0.6449371335 | 1 | < 0.1%
-1.619107225 | 1 | < 0.1%
-1.324555008 | 1 | < 0.1%
-0.2153818175 | 1 | < 0.1%
0.1359330659 | 1 | < 0.1%
1.387779987 | 1 | < 0.1%
Other values (2156) | 2156 | 99.5%
Value | Count | Frequency (%)
-1.72035955 | 1 | < 0.1%
-1.718825423 | 1 | < 0.1%
-1.717291297 | 1 | < 0.1%
-1.715757171 | 1 | < 0.1%
-1.714223045 | 1 | < 0.1%
-1.712688919 | 1 | < 0.1%
-1.711154793 | 1 | < 0.1%
-1.709620667 | 1 | < 0.1%
-1.708086541 | 1 | < 0.1%
-1.706552414 | 1 | < 0.1%
Value | Count | Frequency (%)
1.792789285 | 1 | < 0.1%
1.791255159 | 1 | < 0.1%
1.789721032 | 1 | < 0.1%
1.78665278 | 1 | < 0.1%
1.775913897 | 1 | < 0.1%
1.774379771 | 1 | < 0.1%
1.772845645 | 1 | < 0.1%
1.771311519 | 1 | < 0.1%
1.769777393 | 1 | < 0.1%
1.768243267 | 1 | < 0.1%
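
The very large coefficient of variation reported for df_index follows directly from its definition, CV = standard deviation / mean, applied to a mean that is almost zero. The sketch below reproduces that arithmetic from the figures in this report. The helper name `coefficient_of_variation` is hypothetical, and returning NaN for a mean of exactly zero is an assumption chosen to mirror the `nan` shown for MARRIAGE_2, not a statement about the tool's internals.

```python
import math


def coefficient_of_variation(std, mean):
    """Return std / mean; NaN when the mean is exactly zero (assumed convention)."""
    if mean == 0:
        return float("nan")
    return std / mean


# Standard deviation and mean as reported for df_index in this report.
cv_df_index = coefficient_of_variation(1.00023092, 1.574609941e-16)

# MARRIAGE_2 is reported with mean 0, so the ratio is undefined.
cv_marriage_2 = coefficient_of_variation(1.000231027, 0.0)

print(f"df_index CV: {cv_df_index:.6e}")
print(f"MARRIAGE_2 CV: {cv_marriage_2}")
```

For columns centred this close to zero, the CV carries little information: tiny floating-point differences in the mean change it by orders of magnitude and can flip its sign.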

ASGS_2016
Real number (ℝ)

HIGH CORRELATION
UNIQUE

Distinct: 2166
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.248699802 × 10^-17
Minimum: -1.117431002
Maximum: 3.099693622
Zeros: 0
Zeros (%): 0.0%
Negative: 1356
Negative (%): 62.6%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -1.117431002
5-th percentile: -1.091126467
Q1: -0.9751546823
Median: -0.0524354762
Q3: 0.4850041707
95-th percentile: 2.050697277
Maximum: 3.099693622
Range: 4.217124624
Interquartile range (IQR): 1.460158853

Descriptive statistics

Standard deviation: 1.00023092
Coefficient of variation (CV): 1.905673706 × 10^16
Kurtosis: 0.2674733189
Mean: 5.248699802 × 10^-17
Median Absolute Deviation (MAD): 0.5379142665
Skewness: 0.9873775957
Sum: 1.136868377 × 10^-13
Variance: 1.000461894
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.02645230005 | 1 | < 0.1%
-1.091020973 | 1 | < 0.1%
-1.001459236 | 1 | < 0.1%
-1.022597752 | 1 | < 0.1%
-1.091020999 | 1 | < 0.1%
-0.9803207311 | 1 | < 0.1%
-0.537476788 | 1 | < 0.1%
2.572665711 | 1 | < 0.1%
-0.5481252319 | 1 | < 0.1%
-1.022492268 | 1 | < 0.1%
Other values (2156) | 2156 | 99.5%
Value | Count | Frequency (%)
-1.117431002 | 1 | < 0.1%
-1.117430997 | 1 | < 0.1%
-1.117430992 | 1 | < 0.1%
-1.117430987 | 1 | < 0.1%
-1.117430981 | 1 | < 0.1%
-1.117430976 | 1 | < 0.1%
-1.117378257 | 1 | < 0.1%
-1.117378252 | 1 | < 0.1%
-1.117378247 | 1 | < 0.1%
-1.117378241 | 1 | < 0.1%
Value | Count | Frequency (%)
3.099693622 | 1 | < 0.1%
3.099640903 | 1 | < 0.1%
3.099588185 | 1 | < 0.1%
2.57298206 | 1 | < 0.1%
2.572876475 | 1 | < 0.1%
2.57287647 | 1 | < 0.1%
2.572876465 | 1 | < 0.1%
2.572876459 | 1 | < 0.1%
2.572876454 | 1 | < 0.1%
2.572876449 | 1 | < 0.1%

Year
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 127.0 KiB
0.0: 2166

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 6498
Distinct characters: 2
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 2166 | 100.0%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0.0 | 2166 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 4332 | 66.7%
. | 2166 | 33.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4332 | 66.7%
Other Punctuation | 2166 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4332 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 2166 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6498 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 4332 | 66.7%
. | 2166 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6498 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 4332 | 66.7%
. | 2166 | 33.3%

Houses - median sale price ($)
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 929
Distinct (%): 51.9%
Missing: 375
Missing (%): 17.3%
Infinite: 0
Infinite (%): 0.0%
Mean: -1.428226605 × 10^-16
Minimum: -1.366177591
Maximum: 8.648440852
Zeros: 0
Zeros (%): 0.0%
Negative: 1151
Negative (%): 53.1%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -1.366177591
5-th percentile: -0.9635239339
Q1: -0.6040818885
Median: -0.2505323357
Q3: 0.2588409089
95-th percentile: 1.930671989
Maximum: 8.648440852
Range: 10.01461844
Interquartile range (IQR): 0.8629227974

Descriptive statistics

Standard deviation: 1.000279291
Coefficient of variation (CV): -7.003645552 × 10^15
Kurtosis: 12.1696514
Mean: -1.428226605 × 10^-16
Median Absolute Deviation (MAD): 0.3980706076
Skewness: 2.749068572
Sum: -2.557953849 × 10^-13
Variance: 1.000558659
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.08468501804 | 16 | 0.7%
-0.570036376 | 15 | 0.7%
-0.4390920972 | 14 | 0.6%
-0.5176586645 | 14 | 0.6%
-0.3736199578 | 13 | 0.6%
-0.4914698087 | 13 | 0.6%
0.1370627296 | 12 | 0.6%
-0.6747917991 | 12 | 0.6%
-0.6224140875 | 12 | 0.6%
-0.2557701069 | 12 | 0.6%
Other values (919) | 1658 | 76.5%
(Missing) | 375 | 17.3%
Value | Count | Frequency (%)
-1.366177591 | 1 | < 0.1%
-1.316418765 | 1 | < 0.1%
-1.309871551 | 1 | < 0.1%
-1.279754367 | 1 | < 0.1%
-1.261422168 | 1 | < 0.1%
-1.245708855 | 1 | < 0.1%
-1.214936949 | 1 | < 0.1%
-1.209044457 | 1 | < 0.1%
-1.198568914 | 1 | < 0.1%
-1.193985865 | 1 | < 0.1%
Value | Count | Frequency (%)
8.648440852 | 1 | < 0.1%
8.399646723 | 1 | < 0.1%
7.496131199 | 1 | < 0.1%
5.898610997 | 1 | < 0.1%
5.374833882 | 1 | < 0.1%
5.112945324 | 1 | < 0.1%
4.89034005 | 1 | < 0.1%
4.720112488 | 2 | 0.1%
4.59440598 | 1 | < 0.1%
4.49096 | 1 | < 0.1%

HHTYPE_6
Real number (ℝ)

HIGH CORRELATION

Distinct: 31
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -4.52700358 × 10^-16
Minimum: -2.80625907
Maximum: 7.542499185
Zeros: 0
Zeros (%): 0.0%
Negative: 1293
Negative (%): 59.7%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -2.80625907
5-th percentile: -1.407778225
Q1: -0.5686897177
Median: -0.00929737958
Q3: 0.5500949585
95-th percentile: 1.668879635
Maximum: 7.542499185
Range: 10.34875825
Interquartile range (IQR): 1.118784676

Descriptive statistics

Standard deviation: 1.00023092
Coefficient of variation (CV): -2.20947676 × 10^15
Kurtosis: 5.150207831
Mean: -4.52700358 × 10^-16
Median Absolute Deviation (MAD): 0.5593923381
Skewness: 1.180055613
Sum: -9.805489753 × 10^-13
Variance: 1.000461894
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value | Count | Frequency (%)
-0.5686897177 | 291 | 13.4%
-0.2889935486 | 263 | 12.1%
-0.00929737958 | 234 | 10.8%
-0.8483858867 | 233 | 10.8%
0.2703987895 | 206 | 9.5%
0.5500949585 | 198 | 9.1%
0.8297911275 | 151 | 7.0%
-1.128082056 | 141 | 6.5%
1.109487297 | 112 | 5.2%
1.389183466 | 87 | 4.0%
Other values (21) | 250 | 11.5%
Value | Count | Frequency (%)
-2.80625907 | 2 | 0.1%
-2.526562901 | 2 | 0.1%
-2.246866732 | 13 | 0.6%
-1.967170563 | 21 | 1.0%
-1.687474394 | 29 | 1.3%
-1.407778225 | 64 | 3.0%
-1.128082056 | 141 | 6.5%
-0.8483858867 | 233 | 10.8%
-0.5686897177 | 291 | 13.4%
-0.2889935486 | 263 | 12.1%
Value | Count | Frequency (%)
7.542499185 | 2 | 0.1%
6.423714508 | 1 | < 0.1%
6.144018339 | 1 | < 0.1%
5.025233663 | 2 | 0.1%
4.465841325 | 1 | < 0.1%
4.186145156 | 2 | 0.1%
3.906448987 | 2 | 0.1%
3.626752818 | 2 | 0.1%
3.347056649 | 2 | 0.1%
3.06736048 | 5 | 0.2%

RENT_2
Real number (ℝ)

HIGH CORRELATION

Distinct: 1133
Distinct (%): 52.4%
Missing: 3
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.116895655 × 10^-16
Minimum: -2.62916074
Maximum: 3.957546159
Zeros: 0
Zeros (%): 0.0%
Negative: 1081
Negative (%): 49.9%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -2.62916074
5-th percentile: -1.648284258
Q1: -0.6654064991
Median: 1.850465286 × 10^-5
Q3: 0.670446704
95-th percentile: 1.715864407
Maximum: 3.957546159
Range: 6.586706898
Interquartile range (IQR): 1.335853203

Descriptive statistics

Standard deviation: 1.000231241
Coefficient of variation (CV): 8.955458249 × 10^15
Kurtosis: -0.06423350917
Mean: 1.116895655 × 10^-16
Median Absolute Deviation (MAD): 0.6679266015
Skewness: 0.1155991035
Sum: 2.415845302 × 10^-13
Variance: 1.000462535
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-0.2551444667 | 7 | 0.3%
0.17262875 | 7 | 0.3%
0.5728843914 | 7 | 0.3%
0.3377342021 | 7 | 0.3%
0.4377981124 | 7 | 0.3%
0.9381176641 | 7 | 0.3%
-0.5728473821 | 6 | 0.3%
-0.417748321 | 6 | 0.3%
1.850465286 × 10^-5 | 6 | 0.3%
-0.004984690864 | 6 | 0.3%
Other values (1123) | 2097 | 96.8%
Value | Count | Frequency (%)
-2.62916074 | 1 | < 0.1%
-2.596639969 | 1 | < 0.1%
-2.571623991 | 1 | < 0.1%
-2.551611209 | 1 | < 0.1%
-2.541604818 | 1 | < 0.1%
-2.499077656 | 1 | < 0.1%
-2.41652493 | 1 | < 0.1%
-2.411521735 | 1 | < 0.1%
-2.381502562 | 1 | < 0.1%
-2.333972204 | 1 | < 0.1%
Value | Count | Frequency (%)
3.957546159 | 1 | < 0.1%
3.137022094 | 1 | < 0.1%
3.039459781 | 1 | < 0.1%
3.02945339 | 1 | < 0.1%
2.769287223 | 1 | < 0.1%
2.719255268 | 1 | < 0.1%
2.681731302 | 1 | < 0.1%
2.676728106 | 1 | < 0.1%
2.666721715 | 1 | < 0.1%
2.624194553 | 1 | < 0.1%

RENT_3
Real number (ℝ)

HIGH CORRELATION

Distinct: 1228
Distinct (%): 57.0%
Missing: 13
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: -1.980146964 × 10^-16
Minimum: -3.849943279
Maximum: 5.772769353
Zeros: 0
Zeros (%): 0.0%
Negative: 1110
Negative (%): 51.2%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -3.849943279
5-th percentile: -1.480986661
Q1: -0.7098343218
Median: -0.03591060531
Q3: 0.6108892464
95-th percentile: 1.748005115
Maximum: 5.772769353
Range: 9.622712632
Interquartile range (IQR): 1.320723568

Descriptive statistics

Standard deviation: 1.000232315
Coefficient of variation (CV): -5.051303428 × 10^15
Kurtosis: 1.081703632
Mean: -1.980146964 × 10^-16
Median Absolute Deviation (MAD): 0.6593185585
Skewness: 0.4631580527
Sum: -4.263256415 × 10^-13
Variance: 1.000464684
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.2854028694 | 7 | 0.3%
0.2269822376 | 7 | 0.3%
-0.4406821254 | 7 | 0.3%
0.1685616059 | 7 | 0.3%
0.247846749 | 6 | 0.3%
0.124746132 | 6 | 0.3%
-0.728612382 | 6 | 0.3%
-0.7515633445 | 6 | 0.3%
0.5941976373 | 6 | 0.3%
-1.043666503 | 6 | 0.3%
Other values (1218) | 2089 | 96.4%
(Missing) | 13 | 0.6%
Value | Count | Frequency (%)
-3.849943279 | 1 | < 0.1%
-3.224007939 | 1 | < 0.1%
-2.808804163 | 1 | < 0.1%
-2.754556434 | 1 | < 0.1%
-2.368562974 | 1 | < 0.1%
-2.339352658 | 1 | < 0.1%
-2.268413319 | 2 | 0.1%
-2.262153966 | 1 | < 0.1%
-2.184955274 | 1 | < 0.1%
-2.145312702 | 1 | < 0.1%
Value | Count | Frequency (%)
5.772769353 | 1 | < 0.1%
5.382602991 | 1 | < 0.1%
4.938188899 | 1 | < 0.1%
4.487515454 | 1 | < 0.1%
3.680058865 | 1 | < 0.1%
3.442203436 | 1 | < 0.1%
2.885120983 | 1 | < 0.1%
2.874688727 | 1 | < 0.1%
2.753674562 | 1 | < 0.1%
2.745328757 | 1 | < 0.1%

TENURE_2
Real number (ℝ)

HIGH CORRELATION

Distinct: 459
Distinct (%): 21.2%
Missing: 2
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.838742754 × 10^-16
Minimum: -3.115559563
Maximum: 3.349415845
Zeros: 0
Zeros (%): 0.0%
Negative: 1008
Negative (%): 46.5%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -3.115559563
5-th percentile: -1.74834422
Q1: -0.6350402979
Median: 0.08763066917
Q3: 0.7419408691
95-th percentile: 1.445080188
Maximum: 3.349415845
Range: 6.464975408
Interquartile range (IQR): 1.376981167

Descriptive statistics

Standard deviation: 1.000231134
Coefficient of variation (CV): 5.439755678 × 10^15
Kurtosis: -0.1834578494
Mean: 1.838742754 × 10^-16
Median Absolute Deviation (MAD): 0.6933734954
Skewness: -0.3732769706
Sum: 3.97903932 × 10^-13
Variance: 1.000462321
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.185288908 | 15 | 0.7%
0.5661560393 | 14 | 0.6%
-0.1467491039 | 13 | 0.6%
0.5759218631 | 13 | 0.6%
0.7907699885 | 13 | 0.6%
-0.1272174562 | 13 | 0.6%
0.08763066917 | 13 | 0.6%
0.6345168064 | 12 | 0.6%
0.009504078138 | 12 | 0.6%
0.06809902141 | 12 | 0.6%
Other values (449) | 2034 | 93.9%
Value | Count | Frequency (%)
-3.115559563 | 1 | < 0.1%
-3.05696462 | 1 | < 0.1%
-3.037432972 | 1 | < 0.1%
-2.998369677 | 1 | < 0.1%
-2.959306381 | 1 | < 0.1%
-2.93000891 | 1 | < 0.1%
-2.900711438 | 1 | < 0.1%
-2.861648142 | 1 | < 0.1%
-2.822584847 | 1 | < 0.1%
-2.793287375 | 1 | < 0.1%
Value | Count | Frequency (%)
3.349415845 | 1 | < 0.1%
2.460725872 | 1 | < 0.1%
2.30447269 | 1 | < 0.1%
2.294706866 | 1 | < 0.1%
2.275175218 | 1 | < 0.1%
2.236111923 | 1 | < 0.1%
2.226346099 | 1 | < 0.1%
2.216580275 | 1 | < 0.1%
2.148219508 | 1 | < 0.1%
2.138453684 | 2 | 0.1%

TENURE_3
Real number (ℝ)

HIGH CORRELATION

Distinct: 493
Distinct (%): 22.8%
Missing: 7
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.681551837 × 10^-16
Minimum: -3.130457308
Maximum: 3.876674718
Zeros: 0
Zeros (%): 0.0%
Negative: 1225
Negative (%): 56.6%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -3.130457308
5-th percentile: -1.351723794
Q1: -0.6644858449
Median: -0.1479344455
Q3: 0.5887127675
95-th percentile: 1.820350974
Maximum: 3.876674718
Range: 7.007132026
Interquartile range (IQR): 1.253198612

Descriptive statistics

Standard deviation: 1.000231669
Coefficient of variation (CV): 2.136538703 × 10^15
Kurtosis: 0.7555211844
Mean: 4.681551837 × 10^-16
Median Absolute Deviation (MAD): 0.6018946741
Skewness: 0.467471108
Sum: 1.010747042 × 10^-12
Variance: 1.000463392
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-0.3276044975 | 17 | 0.8%
-0.4443900312 | 16 | 0.7%
-0.606093078 | 15 | 0.7%
-0.3904890157 | 15 | 0.7%
-0.6869446014 | 14 | 0.6%
-0.4803240416 | 14 | 0.6%
-0.2737034819 | 14 | 0.6%
-0.01318190654 | 14 | 0.6%
-0.8666146533 | 14 | 0.6%
-0.04911591694 | 13 | 0.6%
Other values (483) | 2013 | 92.9%
Value | Count | Frequency (%)
-3.130457308 | 1 | < 0.1%
-3.112490303 | 1 | < 0.1%
-3.094523298 | 1 | < 0.1%
-3.085539795 | 1 | < 0.1%
-3.076556292 | 2 | 0.1%
-3.049605785 | 1 | < 0.1%
-3.040622282 | 1 | < 0.1%
-3.013671774 | 1 | < 0.1%
-2.995704769 | 1 | < 0.1%
-2.968754261 | 1 | < 0.1%
Value | Count | Frequency (%)
3.876674718 | 1 | < 0.1%
3.732938677 | 1 | < 0.1%
3.463433599 | 1 | < 0.1%
3.454450096 | 1 | < 0.1%
3.391565578 | 1 | < 0.1%
3.382582075 | 1 | < 0.1%
3.373598573 | 1 | < 0.1%
3.184945018 | 1 | < 0.1%
3.068159485 | 1 | < 0.1%
3.059175982 | 2 | 0.1%

TENURE_4
Real number (ℝ)

HIGH CORRELATION

Distinct: 542
Distinct (%): 25.0%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: -1.312781036 × 10^-16
Minimum: -1.773699529
Maximum: 4.8835397
Zeros: 0
Zeros (%): 0.0%
Negative: 1254
Negative (%): 57.9%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -1.773699529
5-th percentile: -1.273318802
Q1: -0.6786634465
Median: -0.1710308256
Q3: 0.4671358978
95-th percentile: 1.903011025
Maximum: 4.8835397
Range: 6.657239228
Interquartile range (IQR): 1.145799344

Descriptive statistics

Standard deviation: 1.000231027
Coefficient of variation (CV): -7.619176386 × 10^15
Kurtosis: 2.813739628
Mean: -1.312781036 × 10^-16
Median Absolute Deviation (MAD): 0.558395883
Skewness: 1.324677877
Sum: -2.842170943 × 10^-13
Variance: 1.000462107
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-0.7439304978 | 14 | 0.6%
-0.07675619603 | 14 | 0.6%
-0.3668319794 | 13 | 0.6%
-0.3015649281 | 13 | 0.6%
-0.5263736602 | 12 | 0.6%
-0.6786634465 | 12 | 0.6%
0.1262968523 | 12 | 0.6%
-0.09851187978 | 12 | 0.6%
0.1480525361 | 12 | 0.6%
-0.3378244011 | 12 | 0.6%
Other values (532) | 2039 | 94.1%
Value | Count | Frequency (%)
-1.773699529 | 2 | 0.1%
-1.730188161 | 1 | < 0.1%
-1.701180583 | 1 | < 0.1%
-1.686676794 | 1 | < 0.1%
-1.672173005 | 1 | < 0.1%
-1.635913532 | 1 | < 0.1%
-1.628661637 | 3 | 0.1%
-1.614157848 | 1 | < 0.1%
-1.599654059 | 1 | < 0.1%
-1.577898375 | 1 | < 0.1%
Value | Count | Frequency (%)
4.8835397 | 1 | < 0.1%
4.796516965 | 1 | < 0.1%
4.723998019 | 1 | < 0.1%
4.687738546 | 1 | < 0.1%
4.680486651 | 1 | < 0.1%
4.644227178 | 1 | < 0.1%
4.593463916 | 2 | 0.1%
4.549952549 | 1 | < 0.1%
4.470181708 | 1 | < 0.1%
4.310640027 | 1 | < 0.1%

MARRIAGE_2
Real number (ℝ)

HIGH CORRELATION

Distinct: 1860
Distinct (%): 85.9%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0
Minimum: -1.611554573
Maximum: 3.547444515
Zeros: 0
Zeros (%): 0.0%
Negative: 1259
Negative (%): 58.1%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -1.611554573
5-th percentile: -1.236112456
Q1: -0.8054609654
Median: -0.2175408414
Q3: 0.6179487723
95-th percentile: 1.949863441
Maximum: 3.547444515
Range: 5.158999088
Interquartile range (IQR): 1.423409738

Descriptive statistics

Standard deviation: 1.000231027
Coefficient of variation (CV): nan
Kurtosis: 0.0328850444
Mean: 0
Median Absolute Deviation (MAD): 0.666462578
Skewness: 0.8120292814
Sum: 0
Variance: 1.000462107
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-0.6768534383 | 4 | 0.2%
-0.9657610617 | 4 | 0.2%
-0.8196996559 | 4 | 0.2%
0.8999667068 | 4 | 0.2%
-0.1863075848 | 3 | 0.1%
-0.7494248286 | 3 | 0.1%
-0.5032332766 | 3 | 0.1%
-1.012151634 | 3 | 0.1%
-1.085641649 | 3 | 0.1%
-1.306571009 | 3 | 0.1%
Other values (1850) | 2131 | 98.4%
Value | Count | Frequency (%)
-1.611554573 | 1 | < 0.1%
-1.606961447 | 1 | < 0.1%
-1.606502134 | 1 | < 0.1%
-1.603286946 | 1 | < 0.1%
-1.602827634 | 1 | < 0.1%
-1.596397257 | 2 | 0.1%
-1.593182069 | 1 | < 0.1%
-1.590426193 | 1 | < 0.1%
-1.57572819 | 1 | < 0.1%
-1.567460564 | 1 | < 0.1%
Value | Count | Frequency (%)
3.547444515 | 1 | < 0.1%
3.477169688 | 1 | < 0.1%
3.333864158 | 1 | < 0.1%
3.178157187 | 1 | < 0.1%
3.16621506 | 1 | < 0.1%
3.143708742 | 1 | < 0.1%
3.124417613 | 1 | < 0.1%
3.08307948 | 1 | < 0.1%
3.058276599 | 1 | < 0.1%
3.029799218 | 1 | < 0.1%

MARRIAGE_4
Real number (ℝ)

HIGH CORRELATION

Distinct: 1802
Distinct (%): 83.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.624349901 × 10^-17
Minimum: -1.52034968
Maximum: 4.082065696
Zeros: 0
Zeros (%): 0.0%
Negative: 1274
Negative (%): 58.8%
Memory size: 17.0 KiB

Quantile statistics

Minimum: -1.52034968
5-th percentile: -1.161250202
Q1: -0.8022794794
Median: -0.2608620162
Q3: 0.6615433835
95-th percentile: 1.900043878
Maximum: 4.082065696
Range: 5.602415376
Interquartile range (IQR): 1.463822863

Descriptive statistics

Standard deviation: 1.00023092
Coefficient of variation (CV): 3.811347411 × 10^16
Kurtosis: 0.213944171
Mean: 2.624349901 × 10^-17
Median Absolute Deviation (MAD): 0.6669542114
Skewness: 0.8851476318
Sum: 5.684341886 × 10^-14
Variance: 1.000461894
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-1.094941047 | 6 | 0.3%
-1.05322422 | 5 | 0.2%
-1.048589017 | 4 | 0.2%
0.01287247284 | 4 | 0.2%
-0.8606057839 | 4 | 0.2%
0.2642034807 | 4 | 0.2%
-0.4346821293 | 4 | 0.2%
-0.5706480843 | 4 | 0.2%
-0.4903045654 | 3 | 0.1%
-0.857000626 | 3 | 0.1%
Other values (1792) | 2125 | 98.1%
Value | Count | Frequency (%)
-1.52034968 | 1 | < 0.1%
-1.514169409 | 1 | < 0.1%
-1.509534206 | 1 | < 0.1%
-1.509019183 | 1 | < 0.1%
-1.507989138 | 1 | < 0.1%
-1.506959093 | 1 | < 0.1%
-1.499233755 | 1 | < 0.1%
-1.489448326 | 1 | < 0.1%
-1.487903258 | 1 | < 0.1%
-1.487388236 | 1 | < 0.1%
Value | Count | Frequency (%)
4.082065696 | 1 | < 0.1%
3.900777756 | 1 | < 0.1%
3.793653064 | 1 | < 0.1%
3.46043347 | 1 | < 0.1%
3.431077184 | 1 | < 0.1%
3.392450492 | 1 | < 0.1%
3.388330312 | 1 | < 0.1%
3.348158552 | 1 | < 0.1%
3.255454492 | 1 | < 0.1%
3.129273966 | 1 | < 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, where -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line, that is, its angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
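
The definition above can be sketched in a few lines of plain Python. This is a minimal illustration of the formula, not the code the report generator runs; the function name `pearson_r` and the sample data are hypothetical. The 1/n factors of the covariance and of the two standard deviations cancel, so the sketch works with raw sums of deviations.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of (cross-)deviations; the common 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # r is undefined for a constant variable
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data, used only to illustrate location/scale invariance.
x = [1.0, 2.5, 3.0, 4.5, 7.0]
y = [2.0, 1.0, 4.0, 3.5, 6.0]
r_original = pearson_r(x, y)
r_rescaled = pearson_r([3 * a + 5 for a in x], [0.5 * b - 2 for b in y])
print(r_original, r_rescaled)  # the two values agree up to rounding
```

A constant column such as Year has zero variance, which is why the sketch returns NaN there instead of dividing by zero.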

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, where -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, where -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
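
The pair-counting definition can be written out directly. The sketch below implements exactly the formula stated in the text (often called tau-a); the name `kendall_tau_a` is hypothetical. Statistical libraries frequently default to a tie-corrected variant, so their results can differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described in the text."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # The sign of the product tells whether the pair is ordered the same way in x and y.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 means a tie in x or y: the pair counts as neither.
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six gives (5 - 1) / 6.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The double loop is quadratic in the number of observations, which is fine for an illustration but slow for large tables.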

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

The count chart is a simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
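
The sketch below ties these nullity views back to the numbers in this report. It first sums the per-variable missing counts listed in the Variables section and compares them with the overview figures, then computes a nullity correlation as the Pearson correlation of 0/1 "is missing" indicators. That indicator-based definition is an assumption about how the heatmap is built, and `nullity_correlation` is a hypothetical helper. The two short columns are the Houses and RENT_3 values from the ten "Last rows" of the sample, so the resulting coefficient describes only those ten rows, not the full dataset.

```python
import math

# Missing counts per variable, as listed in the Variables section of this report.
missing_by_column = {
    "df_index": 0, "ASGS_2016": 0, "Year": 0,
    "Houses - median sale price ($)": 375,
    "HHTYPE_6": 0, "RENT_2": 3, "RENT_3": 13,
    "TENURE_2": 2, "TENURE_3": 7, "TENURE_4": 1,
    "MARRIAGE_2": 1, "MARRIAGE_4": 0,
}
n_rows = 2166

total_missing = sum(missing_by_column.values())
missing_pct = 100 * total_missing / (n_rows * len(missing_by_column))


def nullity_correlation(col_a, col_b):
    """Pearson correlation of the 0/1 'is missing' indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never (or always) missing
    return cov / math.sqrt(var_a * var_b)


# Houses and RENT_3 from the ten "Last rows" of the sample (None stands for NaN).
houses = [0.182893, 0.359668, 2.003019, 0.372762, None,
          0.186822, None, None, None, None]
rent_3 = [0.594198, 0.667223, 3.680059, 0.955154, 0.333391,
          0.702693, 2.451139, -0.350965, None, None]

print(total_missing, round(missing_pct, 1))
print(nullity_correlation(houses, rent_3))
```

The totals reproduce the overview (402 missing cells, 1.5% of all cells), which is a quick consistency check on the per-variable counts.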

Sample

First rows

# | df_index | ASGS_2016 | Year | Houses - median sale price ($) | HHTYPE_6 | RENT_2 | RENT_3 | TENURE_2 | TENURE_3 | TENURE_4 | MARRIAGE_2 | MARRIAGE_4
0 | -1.720360 | -1.117431 | 0.0 | -0.543848 | -0.848386 | -1.028138 | -0.503276 | 1.406017 | -0.408456 | -0.685915 | -1.056246 | -1.094426
1 | -1.718825 | -1.117431 | 0.0 | NaN | 0.270399 | 0.005022 | 0.162302 | -0.410426 | 0.301241 | 0.061030 | -0.273577 | -0.135969
2 | -1.717291 | -1.117431 | 0.0 | NaN | -1.407778 | 0.052552 | -0.133974 | -0.693635 | -0.309637 | 0.699197 | -0.311241 | 0.397594
3 | -1.715757 | -1.117431 | 0.0 | NaN | -1.128082 | 0.015028 | 0.364688 | -0.928015 | -0.255736 | 0.982020 | -0.974029 | -0.695284
4 | -1.714223 | -1.117431 | 0.0 | 0.281101 | 0.829791 | 0.317721 | 1.063649 | 0.224352 | 1.487063 | -1.266067 | 1.049243 | -0.027814
5 | -1.712689 | -1.117431 | 0.0 | NaN | 1.109487 | 0.778015 | 1.188836 | -0.869420 | 1.352310 | -0.301565 | 0.696951 | -0.009789
6 | -1.711155 | -1.117378 | 0.0 | NaN | -1.128082 | -1.881183 | -1.794789 | 1.747821 | -0.920516 | -0.577137 | -1.206900 | -1.180950
7 | -1.709621 | -1.117378 | 0.0 | NaN | -0.848386 | -0.920569 | -1.089568 | 0.448966 | -0.417440 | -0.004237 | -0.591881 | -0.516571
8 | -1.708087 | -1.117378 | 0.0 | NaN | -0.568690 | -1.083173 | -0.784947 | 1.650162 | -0.120984 | -1.092021 | -1.013070 | -1.143868
9 | -1.706552 | -1.117378 | 0.0 | NaN | -0.288994 | 0.737990 | -0.350965 | 0.253650 | -0.192852 | -0.178283 | -0.758152 | -0.775112

Last rows

# | df_index | ASGS_2016 | Year | Houses - median sale price ($) | HHTYPE_6 | RENT_2 | RENT_3 | TENURE_2 | TENURE_3 | TENURE_4 | MARRIAGE_2 | MARRIAGE_4
2156 | 1.768243 | 2.572876 | 0.0 | 0.182893 | -1.128082 | 0.235169 | 0.594198 | -0.361597 | -1.073235 | 1.170570 | -1.221139 | -1.002237
2157 | 1.769777 | 2.572876 | 0.0 | 0.359668 | -1.128082 | 0.612910 | 0.667223 | -0.058857 | -0.686945 | 0.684693 | -1.145812 | -0.977001
2158 | 1.771312 | 2.572876 | 0.0 | 2.003019 | 1.668880 | 2.544143 | 3.680059 | 2.236112 | -0.085050 | -1.505379 | -1.409916 | -1.418890
2159 | 1.772846 | 2.572876 | 0.0 | 0.372762 | -0.568690 | 0.752999 | 0.955154 | 0.751707 | -0.489308 | -0.076756 | -1.181638 | -1.115542
2160 | 1.774380 | 2.572876 | 0.0 | NaN | -2.246867 | 0.885584 | 0.333391 | -1.445604 | -1.154087 | 2.135072 | -1.371334 | -1.051679
2161 | 1.775914 | 2.572876 | 0.0 | 0.186822 | -0.009297 | 0.697964 | 0.702693 | 0.507561 | -0.103017 | -0.185535 | -1.208737 | -1.203096
2162 | 1.786653 | 2.572982 | 0.0 | NaN | 0.550095 | -0.853026 | 2.451139 | -1.523730 | -0.552192 | 1.438890 | -1.553222 | -1.471423
2163 | 1.789721 | 3.099588 | 0.0 | NaN | -0.009297 | -1.260787 | -0.350965 | 0.009504 | -2.070404 | 1.482401 | -1.382358 | -1.352967
2164 | 1.791255 | 3.099641 | 0.0 | NaN | 3.347057 | -1.545969 | NaN | -2.197572 | -2.968754 | 2.584689 | -1.479732 | -1.487903
2165 | 1.792789 | 3.099694 | 0.0 | NaN | 1.389183 | -1.440902 | NaN | -2.822585 | NaN | 4.883540 | -1.596397 | -1.464727